Results 1 - 7 of 7
1.
medRxiv ; 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38410487

ABSTRACT

Summary: With the rapid growth of genetic data linked to electronic health record (EHR) data in huge cohorts, large-scale phenome-wide association studies (PheWAS) have become powerful discovery tools in biomedical research. PheWAS is an analysis method for studying phenotype associations using longitudinal EHR data. Previous PheWAS packages were developed mostly in the days of smaller biobanks and with earlier PheWAS approaches. PheTK was designed to simplify analysis and efficiently handle biobank-scale data. PheTK uses multithreading and supports a full PheWAS workflow, including extraction of data from OMOP databases and Hail matrix tables as well as PheWAS analysis for both phecode version 1.2 and phecodeX. Benchmarking results showed that PheTK took 64% less time than the R PheWAS package to complete the same workflow. PheTK can be run locally or on cloud platforms such as the All of Us Researcher Workbench (All of Us) or the UK Biobank (UKB) Research Analysis Platform (RAP). Availability and implementation: The PheTK package is freely available on the Python Package Index (PyPI) and on GitHub under the GNU General Public License (GPL-3) at https://github.com/nhgritctran/PheTK. It is implemented in Python and is platform independent. The demonstration workspace for All of Us will be made available in the future as a featured workspace. Contact: PheTK@mail.nih.gov.
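As an illustration of the core PheWAS loop described in this abstract, the following is a minimal sketch in plain Python. It is not PheTK's API: the function names and the input layout (a 0/1 exposure list and a dictionary of 0/1 phecode indicators) are hypothetical, and a simple 2×2 chi-square test is swapped in for the covariate-adjusted regression that PheWAS tools typically fit for each phecode.

```python
import math


def chi2_1df_pvalue(a, b, c, d):
    """P value of the Pearson chi-square test (1 df, no continuity
    correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: no evidence of association
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(stat / 2))


def phewas(exposure, phenotypes, alpha=0.05):
    """Test one binary exposure against every phecode.

    exposure:   list of 0/1 values, one per participant.
    phenotypes: dict mapping phecode -> list of 0/1 case indicators
                (hypothetical layout, chosen for illustration).
    Returns result dicts sorted by P value.
    """
    threshold = alpha / len(phenotypes)  # Bonferroni correction
    results = []
    for phecode, cases in phenotypes.items():
        pairs = list(zip(exposure, cases))
        a = sum(1 for e, y in pairs if e and y)          # exposed cases
        b = sum(1 for e, y in pairs if e and not y)      # exposed controls
        c = sum(1 for e, y in pairs if not e and y)      # unexposed cases
        d = sum(1 for e, y in pairs if not e and not y)  # unexposed controls
        # Haldane-Anscombe correction keeps the odds ratio finite
        odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
        p = chi2_1df_pvalue(a, b, c, d)
        results.append({
            "phecode": phecode,
            "odds_ratio": odds_ratio,
            "p": p,
            "significant": p < threshold,
        })
    return sorted(results, key=lambda r: r["p"])
```

Because each phecode is tested independently, a loop of this kind parallelizes naturally, which is consistent with the multithreading mentioned in the abstract.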

2.
J Am Med Inform Assoc ; 31(4): 846-854, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38263490

ABSTRACT

IMPORTANCE: Knowledge gained from cohort studies has dramatically advanced both public and precision health. The All of Us Research Program seeks to enroll 1 million diverse participants who share multiple sources of data, providing unique opportunities for research. It is important to understand the phenomic profiles of its participants to conduct research in this cohort. OBJECTIVES: More than 280 000 participants have shared their electronic health records (EHRs) in the All of Us Research Program. We aim to understand the phenomic profiles of this cohort through comparisons with those in the US general population and a well-established nation-wide cohort, UK Biobank, and to test whether association results of selected commonly studied diseases in the All of Us cohort were comparable to those in UK Biobank. MATERIALS AND METHODS: We included participants with EHRs in All of Us and participants with health records from UK Biobank. The estimates of prevalence of diseases in the US general population were obtained from the Global Burden of Diseases (GBD) study. We conducted phenome-wide association studies (PheWAS) of 9 commonly studied diseases in both cohorts. RESULTS: This study included 287 012 participants from the All of Us EHR cohort and 502 477 participants from the UK Biobank. A total of 314 diseases curated by the GBD were evaluated in All of Us, 80.9% (N = 254) of which were more common in All of Us than in the US general population [prevalence ratio (PR) >1.1, P < 2 × 10⁻⁵]. Among 2515 diseases and phenotypes evaluated in both All of Us and UK Biobank, 85.6% (N = 2152) were more common in All of Us (PR >1.1, P < 2 × 10⁻⁵).
The Pearson correlation coefficients of effect sizes from PheWAS between All of Us and UK Biobank were 0.61, 0.50, 0.60, 0.57, 0.40, 0.53, 0.46, 0.47, and 0.24 for ischemic heart diseases, lung cancer, chronic obstructive pulmonary disease, dementia, colorectal cancer, lower back pain, multiple sclerosis, lupus, and cystic fibrosis, respectively. DISCUSSION: Despite the differences in prevalence of diseases in All of Us compared to the US general population or the UK Biobank, our study supports that All of Us can facilitate rapid investigation of a broad range of diseases. CONCLUSION: Most diseases were more common in All of Us than in the general US population or the UK Biobank. Results of disease-disease association tests from All of Us are comparable to those estimated in another well-studied national cohort.
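The two summary statistics used in this abstract, the prevalence ratio and the Pearson correlation of PheWAS effect sizes, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the P value is taken as an input because the abstract does not state which test produced it.

```python
import math


def prevalence_ratio(cases_a, n_a, cases_b, n_b):
    """Prevalence in cohort A divided by prevalence in cohort B."""
    return (cases_a / n_a) / (cases_b / n_b)


def more_common(pr, p_value, pr_min=1.1, p_max=2e-5):
    """Apply the abstract's criterion: PR > 1.1 and P < 2e-5."""
    return pr > pr_min and p_value < p_max


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences,
    e.g. per-phenotype effect sizes estimated in two cohorts."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In the study's setting, `xs` and `ys` would hold the effect sizes for the same phenotypes in All of Us and UK Biobank, one pair per phenotype.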


Subjects
Phenomics; Population Health; Humans; Biological Specimen Banks; 60682; Phenotype; United Kingdom/epidemiology
3.
J Am Med Inform Assoc ; 31(1): 139-153, 2023 Dec 22.
Article in English | MEDLINE | ID: mdl-37885303

ABSTRACT

OBJECTIVE: The All of Us Research Program (All of Us) aims to recruit over a million participants to further precision medicine. Essential to the verification of biobanks is a replication of known associations to establish validity. Here, we evaluated how well All of Us data replicated known cigarette smoking associations. MATERIALS AND METHODS: We defined smoking exposure as follows: (1) an EHR Smoking exposure that used International Classification of Diseases codes; (2) participant-provided information (PPI) Ever Smoking; and (3) PPI Current Smoking, the latter two both from the lifestyle survey. We performed a phenome-wide association study (PheWAS) for each smoking exposure measurement type. For each, we compared the effect sizes derived from the PheWAS to published meta-analyses from PubMed that studied cigarette smoking. We defined two levels of replication of meta-analyses: (1) nominally replicated, which required agreement of direction of effect size, and (2) fully replicated, which required overlap of confidence intervals. RESULTS: PheWASes with EHR Smoking, PPI Ever Smoking, and PPI Current Smoking revealed 736, 492, and 639 phenome-wide significant associations, respectively. We identified 165 meta-analyses representing 99 distinct phenotypes that could be matched to EHR phenotypes. At P < .05, 74 were nominally replicated and 55 were fully replicated. At P < 2.68 × 10⁻⁵ (Bonferroni threshold), 58 were nominally replicated and 40 were fully replicated. DISCUSSION: Most phenotypes found in published meta-analyses associated with smoking were nominally replicated in All of Us. Both survey and EHR definitions for smoking produced similar results. CONCLUSION: This study demonstrated the feasibility of studying common exposures using All of Us data.
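The two replication levels defined in this abstract can be expressed as a small classifier. The sketch below rests on assumptions that the abstract does not spell out: effect sizes are treated as odds ratios (1.0 meaning no effect), the PheWAS association must pass the chosen P threshold before it counts at either level, and "fully replicated" is taken to require direction agreement in addition to overlapping confidence intervals. The function name and signature are hypothetical.

```python
def replication_level(phewas_or, phewas_ci, phewas_p,
                      meta_or, meta_ci, alpha=0.05):
    """Classify one PheWAS estimate against a published meta-analysis.

    phewas_ci and meta_ci are (low, high) tuples on the odds-ratio scale.
    Returns "fully replicated", "nominally replicated" or "not replicated".
    """
    if phewas_p >= alpha:
        return "not replicated"  # association not significant at alpha
    if (phewas_or > 1) != (meta_or > 1):
        return "not replicated"  # effects point in opposite directions
    low_a, high_a = phewas_ci
    low_b, high_b = meta_ci
    if low_a <= high_b and low_b <= high_a:
        return "fully replicated"  # confidence intervals overlap
    return "nominally replicated"  # same direction only
```

Passing `alpha=2.68e-5` instead of the default reproduces the stricter Bonferroni comparison reported in the results.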


Subjects
Genome-Wide Association Study; Population Health; Humans; Genome-Wide Association Study/methods; Phenotype; Polymorphism, Single Nucleotide; Smoking
4.
mSphere ; 7(5): e0025722, 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36173112

ABSTRACT

Accurate, highly specific immunoassays for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are needed to evaluate seroprevalence. This study investigated the concordance of results across four immunoassays targeting different antigens for sera collected at the beginning of the SARS-CoV-2 pandemic in the United States. Specimens from All of Us participants contributed between January and March 2020 were tested using the Abbott Architect SARS-CoV-2 IgG (immunoglobulin G) assay (Abbott) and the EuroImmun SARS-CoV-2 enzyme-linked immunosorbent assay (ELISA) (EI). Participants with discordant results, participants with concordant positive results, and a subset of concordant negative results by Abbott and EI were also tested using the Roche Elecsys anti-SARS-CoV-2 (IgG) test (Roche) and the Ortho-Clinical Diagnostics Vitros anti-SARS-CoV-2 IgG test (Ortho). The agreement and 95% confidence intervals were estimated for paired assay combinations. SARS-CoV-2 antibody concentrations were quantified for specimens with at least two positive results across four immunoassays. Among the 24,079 participants, the percent agreement for the Abbott and EI assays was 98.8% (95% confidence interval, 98.7%, 99%). Of the 490 participants who were also tested by Ortho and Roche, the probability-weighted percentage of agreement (95% confidence interval) between Ortho and Roche was 98.4% (97.9%, 98.9%), that between EI and Ortho was 98.5% (92.9%, 99.9%), that between Abbott and Roche was 98.9% (90.3%, 100.0%), that between EI and Roche was 98.9% (98.6%, 100.0%), and that between Abbott and Ortho was 98.4% (91.2%, 100.0%). Among the 32 participants who were positive by at least 2 immunoassays, 21 had quantifiable anti-SARS-CoV-2 antibody concentrations by research assays. The results across immunoassays revealed concordance during a period of low prevalence. 
However, the frequency of false positivity during a period of low prevalence supports the use of two sequentially performed tests for unvaccinated individuals who are seropositive by the first test. IMPORTANCE: What is the agreement of commercial SARS-CoV-2 immunoglobulin G (IgG) assays during a time of low coronavirus disease 2019 (COVID-19) prevalence and no vaccine availability? Serological tests produced concordant results in a time of low SARS-CoV-2 prevalence and no vaccine availability, driven largely by the proportion of samples that were negative by two immunoassays. The CDC recommends two sequential tests for positivity for future pandemic preparedness. In a subset analysis, quantified antinucleocapsid and antispike SARS-CoV-2 IgG antibodies do not suggest the need to specify the antigen targets of the sequential assays in the CDC's recommendation because false positivity varied as much between assays targeting the same antigen as it did between assays targeting different antigens.
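Percent agreement between two assays, with a 95% confidence interval, can be computed as in the sketch below. Two assumptions are worth flagging: the abstract does not say which interval method the authors used, so the Wilson score interval is used here as one common choice, and the calculation is unweighted, whereas the abstract reports probability-weighted agreement for the 490-participant subset. Function names are hypothetical.

```python
import math


def count_agreement(results_a, results_b):
    """Return (number of concordant pairs, total pairs) for two assays
    run on the same specimens; results are any comparable labels."""
    if len(results_a) != len(results_b) or not results_a:
        raise ValueError("need two non-empty result lists of equal length")
    agree = sum(1 for a, b in zip(results_a, results_b) if a == b)
    return agree, len(results_a)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)
```

A typical call would be `agree, n = count_agreement(abbott, ei)` followed by `wilson_interval(agree, n)`, where `abbott` and `ei` are hypothetical lists of positive/negative calls for the same specimens.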


Subjects
COVID-19; Population Health; Humans; SARS-CoV-2; COVID-19/diagnosis; COVID-19/epidemiology; Prevalence; Seroepidemiologic Studies; Sensitivity and Specificity; Antibodies, Viral; Immunoglobulin G
5.
Patterns (N Y) ; 3(8): 100570, 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-36033590

ABSTRACT

The All of Us Research Program seeks to engage at least one million diverse participants to advance precision medicine and improve human health. We describe here the cloud-based Researcher Workbench that uses a data passport model to democratize access to analytical tools and participant information including survey, physical measurement, and electronic health record (EHR) data. We also present validation study findings for several common complex diseases to demonstrate use of this novel platform in 315,000 participants, 78% of whom are from groups historically underrepresented in biomedical research, including 49% self-reporting non-White races. Replication findings include medication usage pattern differences by race in depression and type 2 diabetes, validation of known cancer associations with smoking, and calculation of cardiovascular risk scores by reported race effects. The cloud-based Researcher Workbench represents an important advance in enabling secure access for a broad range of researchers to this large resource and analytical tools.

6.
Clin Infect Dis ; 74(4): 584-590, 2022 Mar 01.
Article in English | MEDLINE | ID: mdl-34128970

ABSTRACT

BACKGROUND: With limited severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) testing capacity in the United States at the start of the epidemic (January-March 2020), testing was focused on symptomatic patients with a travel history throughout February, obscuring the picture of SARS-CoV-2 seeding and community transmission. We sought to identify individuals with SARS-CoV-2 antibodies in the early weeks of the US epidemic. METHODS: All of Us study participants in all 50 US states provided blood specimens during study visits from 2 January to 18 March 2020. Participants were considered seropositive if they tested positive for SARS-CoV-2 immunoglobulin G (IgG) antibodies with the Abbott Architect SARS-CoV-2 IgG enzyme-linked immunosorbent assay (ELISA) and the EUROIMMUN SARS-CoV-2 ELISA in a sequential testing algorithm. The sensitivity and specificity of these ELISAs and the net sensitivity and specificity of the sequential testing algorithm were estimated, along with 95% confidence intervals (CIs). RESULTS: The estimated sensitivities of the Abbott and EUROIMMUN assays were 100% (107 of 107 [95% CI: 96.6%-100%]) and 90.7% (97 of 107 [83.5%-95.4%]), respectively, and the estimated specificities were 99.5% (995 of 1000 [98.8%-99.8%]) and 99.7% (997 of 1000 [99.1%-99.9%]), respectively. The net sensitivity and specificity of our sequential testing algorithm were 90.7% (97 of 107 [95% CI: 83.5%-95.4%]) and 100.0% (1000 of 1000 [99.6%-100%]), respectively. Of the 24 079 study participants with blood specimens from 2 January to 18 March 2020, 9 were seropositive, 7 before the first confirmed case in the states of Illinois, Massachusetts, Wisconsin, Pennsylvania, and Mississippi. CONCLUSIONS: Our findings identified SARS-CoV-2 infections weeks before the first recognized cases in 5 US states.
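The net accuracy of a sequential (serial) testing algorithm, in which a specimen is called positive only if both assays are positive, can be approximated from the individual sensitivities and specificities. The sketch below uses the textbook formulas and assumes the two assays err independently given true infection status. That is an assumption of this example, and the net figures in the abstract appear to come from observed counts rather than from this formula. The function name is hypothetical.

```python
def net_serial_accuracy(se1, sp1, se2, sp2):
    """Net sensitivity and specificity of two tests applied in series.

    A specimen is positive only if test 1 AND test 2 are positive;
    test 2 is run only on specimens positive by test 1.
    Assumes conditional independence of the two tests.
    """
    net_sensitivity = se1 * se2
    # A true negative is cleared by test 1, or flagged by test 1
    # and then cleared by test 2.
    net_specificity = sp1 + (1 - sp1) * sp2
    return net_sensitivity, net_specificity


# Point estimates reported in the abstract (Abbott first, EUROIMMUN second)
net_se, net_sp = net_serial_accuracy(107 / 107, 995 / 1000,
                                     97 / 107, 997 / 1000)
print(f"net sensitivity = {net_se:.1%}, net specificity = {net_sp:.1%}")
```

Under the independence assumption, these inputs give values that round to the 90.7% and 100.0% reported in the abstract. The gain in specificity comes at the cost of sensitivity, which can never exceed that of the less sensitive assay.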


Subjects
COVID-19; Population Health; Antibodies, Viral; COVID-19/diagnosis; Enzyme-Linked Immunosorbent Assay; Humans; Immunoglobulin G; SARS-CoV-2; Sensitivity and Specificity
7.
J Biomed Inform ; 98: 103270, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31445983

ABSTRACT

OBJECTIVE: Discovering subphenotypes of complex diseases can help characterize disease cohorts for investigative studies aimed at developing better diagnoses and treatments. Recent advances in unsupervised machine learning on electronic health record (EHR) data have enabled researchers to discover phenotypes without input from domain experts. However, most existing studies have ignored time and modeled diseases as discrete events. Uncovering the evolution of phenotypes (how they emerge, evolve, and contribute to health outcomes) is essential to define more precise phenotypes and refine the understanding of disease progression. Our objective was to assess the benefits of an unsupervised approach that incorporates time to model diseases as dynamic processes in phenotype discovery. METHODS: In this study, we applied a constrained non-negative tensor-factorization approach to characterize the complexity of a cardiovascular disease (CVD) patient cohort based on longitudinal EHR data. Through tensor factorization, we identified a set of phenotypic topics (i.e., subphenotypes) that these patients established over the 10 years prior to the diagnosis of CVD, and showed their progression patterns. For each identified subphenotype, we examined its association with the risk for adverse cardiovascular outcomes estimated by the American College of Cardiology/American Heart Association Pooled Cohort Risk Equations, a conventional CVD-risk assessment tool frequently used in clinical practice. Furthermore, we compared the subsequent myocardial infarction (MI) rates among the six most prevalent subphenotypes using survival analysis. RESULTS: From a cohort of 12,380 adults with CVD, with 1068 unique PheCodes, we successfully identified 14 subphenotypes. Through the association analysis with estimated CVD risk for each subtype, we found that some phenotypic topics, such as vitamin D deficiency and depression, and urinary infections, cannot be explained by the conventional risk factors.
Through a survival analysis, we found markedly different risks of subsequent MI following the diagnosis of CVD among the six most prevalent topics (p < 0.0001), indicating these topics may capture clinically meaningful subphenotypes of CVD. CONCLUSION: This study demonstrates the potential benefits of using tensor-decomposition to model diseases as dynamic processes from longitudinal EHR data. Our results suggest that this data-driven approach may potentially help researchers identify complex and chronic disease subphenotypes in precision medicine research.
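To illustrate the kind of decomposition this abstract refers to, the sketch below factorizes a three-way tensor (patients × PheCodes × time windows) into non-negative factors using multiplicative updates. It is a generic non-negative CP factorization written with NumPy, not the authors' method: the additional constraints used in the study are not described in the abstract and are not modeled here, and the tensor layout, function names, and parameters are assumptions made for illustration.

```python
import numpy as np


def nonneg_cp(tensor, rank, n_iter=200, seed=0, eps=1e-9):
    """Non-negative CP factorization of a 3-way tensor by
    multiplicative updates (squared-error objective).

    tensor: non-negative array, shape (patients, phecodes, time_windows).
    Returns A (patients x rank), B (phecodes x rank), C (time x rank);
    each column triple is one candidate phenotypic topic.
    """
    rng = np.random.default_rng(seed)
    n_i, n_j, n_k = tensor.shape
    A = rng.random((n_i, rank)) + eps
    B = rng.random((n_j, rank)) + eps
    C = rng.random((n_k, rank)) + eps
    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) /
        # (positive gradient part), which preserves non-negativity.
        A *= np.einsum("ijk,jr,kr->ir", tensor, B, C) / (
            A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", tensor, A, C) / (
            B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", tensor, A, B) / (
            C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C


def reconstruct(A, B, C):
    """Rebuild the tensor as a sum of rank-one components."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)
```

In this layout, a column of `B` indicates which PheCodes load on a topic, the matching column of `C` shows how that topic's weight changes across time windows, and the matching column of `A` gives each patient's membership in the topic.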


Subjects
Cardiovascular Diseases/diagnosis; Electronic Health Records; Medical Informatics/methods; Academic Medical Centers; Algorithms; Databases, Factual; Humans; Myocardial Infarction/complications; Phenotype; Precision Medicine; Risk; Risk Factors; Societies, Medical; United States; Unsupervised Machine Learning; Urinary Tract Infections/complications; Urinary Tract Infections/diagnosis; Vitamin D Deficiency/complications